Comparison of record linkage methods
Abstract
Record linkage is an important tool to enhance database integration. This is even more valuable in a scenario with hefty budget cuts and a growing drop in the response rate of traditional surveys. This strategy makes it possible to expand the crossing alternatives with variables not present in the original base. However, there are many different data pairing methods exposed in the literature. In this sense, the objective of this paper is to compare well-known record linkage methods. The comparison was made on a synthetic dataset. To compare the methods, a quantitative approach was adopted based on the Precision, Recall, and F-Statistics metrics, using two string comparison functions: Levenshtein and Jaro-Winkler. Among the six types of classifiers analyzed, the supervised ones had the best results.
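As a rough illustration of the evaluation described in the abstract, the sketch below links two toy name lists with a normalized Levenshtein similarity and scores the result with precision, recall, and F1. It is not the paper's implementation: the helper names (`levenshtein`, `similarity`, `link_pairs`, `evaluate`), the simple threshold rule, the 0.8 cutoff, and the reading of "F-Statistics" as the harmonic mean of precision and recall are all assumptions made for the example.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def link_pairs(left: Dict[int, str], right: Dict[int, str],
               threshold: float = 0.8) -> Set[Pair]:
    """Hypothetical threshold classifier: declare a match when the
    similarity reaches the (assumed) cutoff."""
    return {(i, j)
            for i, x in left.items()
            for j, y in right.items()
            if similarity(x, y) >= threshold}


def evaluate(predicted: Set[Pair], truth: Set[Pair]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 of predicted links against known true links."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Comparing every record on one side with every record on the other, as `link_pairs` does, is only practical for small inputs. The supervised classifiers the abstract reports as best would replace the fixed threshold with a model trained on labeled pairs.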
Similar resources
A Comparison of Blocking Methods for Record Linkage
Record linkage seeks to merge databases and to remove duplicates when unique identifiers are not available. Most approaches use blocking techniques to reduce the computational complexity associated with record linkage. We review traditional blocking techniques, which typically partition the records according to a set of field attributes, and consider two variants of a method known as locality s...
A Comparison of Fast Blocking Methods for Record Linkage
Blocking methods are used in record linkage systems to reduce the number of candidate record comparison pairs to a feasible number whilst still maintaining linkage accuracy. Blocking methods partition the data sets into blocks or clusters of records which share a blocking attribute or are otherwise similar with respect to a defined criterion. We compare two new blocking methods, bigram indexing...
Advanced Methods for Record Linkage 940920
Record linkage, or computer matching, is needed for the creation and maintenance of name and address lists that support operations for and evaluations of a Year 2000 Census. This paper describes three advances. The first is an enhanced method of string comparison for dealing with typographical variations and scanning errors. It improves upon string comparators in computer science. The second is...
Grouping methods for ongoing record linkage
The grouping of record-pairs to determine which records belong to the same individual is an important part of the record linkage process. While a merge grouping approach is commonly used, other methods may be more appropriate when linking to a repository of previously linked data. In this paper, we applied a number of grouping strategies to three large scale hospital datasets (comprising around...
Some methods for blindfolded record linkage
BACKGROUND The linkage of records which refer to the same entity in separate data collections is a common requirement in public health and biomedical research. Traditionally, record linkage techniques have required that all the identifying data in which links are sought be revealed to at least one party, often a third party. This necessarily invades personal privacy and requires complete trust ...
Journal
Journal title: GeSec
Year: 2023
ISSN: 2178-9010
DOI: https://doi.org/10.7769/gesec.v14i5.2171